AITopics | imbalance dataset

Collaborating Authors

imbalance dataset

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Fake detection in imbalance dataset by Semi-supervised learning with GAN

Bordbar, Jinus, Ardalan, Saman, Mohammadrezaie, Mohammadreza, Ghasemi, Zahra

arXiv.org Artificial IntelligenceDec-20-2023

As social media continues to grow rapidly, the prevalence of harassment on these platforms has also increased. This has piqued the interest of researchers in the field of fake detection. Social media data, often forms complex graphs with numerous nodes, posing several challenges. These challenges and limitations include dealing with a significant amount of irrelevant features in matrices and addressing issues such as high data dispersion and an imbalanced class distribution within the dataset. To overcome these challenges and limitations, researchers have employed auto-encoders and a combination of semi-supervised learning with a GAN algorithm, referred to as SGAN. Our proposed method utilizes auto-encoders for feature extraction and incorporates SGAN. By leveraging an unlabeled dataset, the unsupervised layer of SGAN compensates for the limited availability of labeled data, making efficient use of the limited number of labeled instances. Multiple evaluation metrics were employed, including the Confusion Matrix and the ROC curve. The dataset was divided into training and testing sets, with 100 labeled samples for training and 1,000 samples for testing. The novelty of our research lies in applying SGAN to address the issue of imbalanced datasets in fake account detection. By optimizing the use of a smaller number of labeled instances and reducing the need for extensive computational power, our method offers a more efficient solution. Additionally, our study contributes to the field by achieving an 81% accuracy in detecting fake accounts using only 100 labeled samples. This demonstrates the potential of SGAN as a powerful tool for handling minority classes and addressing big data challenges in fake account detection.

fake detection, imbalance dataset, semi-supervised, (1 more...)

arXiv.org Artificial Intelligence

2212.01071

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.53)

Add feedback

Deep Learning for Bridge Load Capacity Estimation in Post-Disaster and -Conflict Zones

Pamuncak, Arya, Guo, Weisi, Khaled, Ahmed Soliman, Laory, Irwanda

arXiv.org Machine LearningFeb-5-2019

Many post-disaster and -conflict regions do not have sufficient data on their transportation infrastructure assets, hindering both mobility and reconstruction. In particular, as the number of aging and deteriorating bridges increase, it is necessary to quantify their load characteristics in order to inform maintenance and prevent failure. The load carrying capacity and the design load are considered as the main aspects of any civil structures. Human examination can be costly and slow when expertise is lacking in challenging scenarios. In this paper, we propose to employ deep learning as method to estimate the load carrying capacity from crowd sourced images. A new convolutional neural network architecture is trained on data from over 6000 bridges, which will benefit future research and applications. We tackle significant variations in the dataset (e.g. class interval, image completion, image colour) and quantify their impact on the prediction accuracy, precision, recall and F1 score. Finally, practical optimisation is performed by converting multiclass classification into binary classification to achieve a promising field use performance.

dataset, neural network, prediction model, (16 more...)

arXiv.org Machine Learning

1902.05391

Country:

North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.95)
Health & Medicine (0.68)
Government > Regional Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Logistic regression on large imbalance datasets

@machinelearnbotMay-12-2017, 04:10:02 GMT

Hello, I am working on a highly imbalanced dataset (negative examples over 20K and positive examples about 100). I am trying to build a logistic regression model. My current approach includes undersampling of negative examples. However with this approach there are a couple of problems: 1) Several LR models are possible with different samples. How to generalize these models and interpret the output?

imbalance dataset, logistic regression, true positive, (1 more...)

@machinelearnbot

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback